Patent abstract:
A method and apparatus for filtering the aberrations of disparity or depth images by an adaptive approach are described. The method makes it possible to locally filter the points that do not exhibit spatial coherence in their 3D neighborhood, according to a criterion derived from the geometrical reality of the transformations performed on the light signals. Advantageously, the noise filtering method can be applied to a dense depth image or to a dense disparity image.
Publication number: FR3028988A1
Application number: FR1461260
Filing date: 2014-11-20
Publication date: 2016-05-27
Inventor: Mohamed Chaouch
Applicant: Commissariat à l'Énergie Atomique (CEA); Commissariat à l'Énergie Atomique et aux Énergies Alternatives (CEA);
IPC primary class:
Patent description:

[0001] FIELD OF THE INVENTION The invention relates to the field of image processing and computer vision, and in particular to the processing of noisy disparity or depth images.
STATE OF THE ART Scene analysis in images (such as image segmentation, background subtraction, automatic object recognition, multi-class detection) is a field widely explored in the literature, primarily for single-sensor (2D) images. Taking advantage of more recent advances in 3D perception, scene analysis also attempts to exploit depth information, since an object is not only a visual unit consistent in terms of color and/or texture, but also a spatially compact unit. Several 3D perception systems exist:
- Equipment of the 3D scanner type or time-of-flight (TOF) cameras. This type of 3D sensor provides a depth image where each pixel corresponds to the distance between a point of the scene and a specific point. The depth images obtained are generally quite precise, but they nevertheless include aberrations (for example, "speckle" for TOF cameras). Their cost is high, from one to several thousand euros, limiting their use to applications with few cost constraints. In addition, many of these 3D sensors are unusable in real-time applications because of their low frame rate.
- Stereoscopic systems, generally composed of a set of cameras and/or projectors, associated with specific processing (such as a disparity calculation). Their advantage is a much lower cost, owing to the price of standard cameras, or even to cameras already present for other applications (for example a reversing-camera function). On the other hand, these images are noisier (sensitivity to lighting conditions, problems with weakly textured surfaces, etc.), and the depth image deduced from the disparity map is not dense. The non-linear transformation (disparity map to depth map) produces a non-uniform information density in the depth map. Typically, data close to the camera will be denser, and data at object boundaries may be inaccurate.
The quality of the depth image or the disparity image has a significant impact on the performance of the processing applied to it. In the case of stereoscopic images, significant errors in the depth image penalize the subsequent processing. Thus, 3D scene analysis systems (e.g., scene segmentation) are either expensive or degraded by the errors present in the depth map. A filtering of the depth-related data can be carried out on the disparity map. Aberrant errors are conventionally processed by median-type filters. The only parameter of such a filter is the size (or even the shape) of the support; square supports of 3x3 or 5x5 type are typically used. While the ability to remove noise increases with the size of the support, this increase is accompanied by the suppression of details, as well as the potential displacement of contours in the presence of noise. In the context of segmentation, this may result in imprecise segmentation, and it should be noted that this impact is not uniform over the depth image or the disparity image. Conversely, using a small support decreases the filtering ability: if the noise is statistically significant, its filtering will be only partial. Thus, the choice of the size of the filter governs the compromise between aberration suppression and deformation of the image.
This choice is left to the discretion of the user, and there is no method for the automatic determination of an "optimal" value. In the article entitled "Rapid 3D object detection and modeling using range data from 3D range imaging camera for heavy equipment operation" by Son, Kim & Choi, published in the journal "Automation in Construction", Vol. 19, pp. 898-906, Elsevier, 2010, the authors present a 3D scene segmentation system, consisting of a time-of-flight camera and software processing including successive steps of noise reduction in the depth images, subtraction of ground elements, segmentation of objects and creation of volumes encompassing the objects. The limitations of such an approach are that the system requires a time-of-flight camera, which is an expensive device, and that the filtering operations are adapted to the type of noise specific to the sensor. The filtering uses fixed supports, without considering the local characteristics of the signal: an average difference filter (ADV) of size 3x3 associated with a fixed threshold of 0.6 to filter the outliers of the "dropout" type (wave not received on the sensor), as well as a 3x3 median filter to correct the speckle noise. Moreover, as previously mentioned, a fixed support size and a fixed threshold do not make it possible to optimize the filtering/preservation compromise of the signal as a function of the real and local characteristics of the signal, particularly those related to the geometry of a 3D approach. Finally, the global segmentation approach uses a dense 3D mesh allowing a fine segmentation, but its computation time, of the order of one second, remains long.
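By way of illustration only, the conventional fixed-support approach criticized above can be sketched in a few lines. This is a minimal sketch, not part of the invention, assuming a disparity map held in a NumPy array; the helper name is hypothetical:

```python
# Minimal sketch of conventional fixed-support median filtering of a
# disparity map (the prior-art baseline discussed above); the support
# size is its only parameter and its choice is left to the user.
import numpy as np
from scipy.ndimage import median_filter

def median_denoise(disparity: np.ndarray, support: int = 3) -> np.ndarray:
    # A larger support removes more noise but suppresses details and can
    # displace contours; a smaller one filters only partially.
    return median_filter(disparity, size=support)
```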
[0002] In the patent application EP 2541496 (A2), "Method, medium and apparatus of filtering depth noise using depth information", of Samsung Electronics, a method of filtering depth noise can perform spatial or temporal filtering according to the depth information. In order to perform spatial filtering, the method may determine a spatial filter characteristic based on the depth information. Similarly, in order to perform temporal filtering, the method may determine a number of reference frames based on the depth information. Although this solution adapts the size and the coefficients of the filter to be applied according to the depth of the region to be treated, it has disadvantages, among others that the characteristics of the filter do not take into account the distance of objects to the optical center of the camera. In patent application WO 2013079602 (A1), "Spatio-temporal disparity-map smoothing by joint multilateral filtering", by Kauff P. et al., a filter structure for filtering a disparity map D(p, t0) comprises a first filter, a second filter and a filter selector. The first filter is for filtering a considered section of the disparity map according to a first measure of central tendency. The second filter is for filtering the considered section of the disparity map according to a second measure of central tendency. The filter selector is provided for selecting the first filter or the second filter to filter the relevant section of the disparity map, the selection being based on at least one local property of the considered section. This approach, which operates only on the disparity map, depends on the choice of a fixed threshold for the discrimination filter that is not consistent with the physical or geometric reality. Thus, there is no solution in the prior art that makes it possible to improve the quality of a depth image, and consequently that of the subsequent processing, while maintaining a low system cost. Moreover, there is no known approach that takes into account the geometrical reality of the operations performed on the original light signal.
[0003] There is therefore a need for a solution that overcomes the disadvantages of the known approaches. The present invention meets this need. SUMMARY OF THE INVENTION An object of the present invention is to provide a device and a method for filtering the aberrations of disparity or depth images by an adaptive approach. The proposed approach makes it possible to locally filter the points that do not have spatial coherence in their 3D neighborhood, according to a criterion derived from the geometric reality of the transformations performed on the light signals.
[0004] The adaptive filtering of the present invention improves on existing methods by stabilizing, over the entire 3D space, the filtering-capacity/detail-preservation compromise, which compromise is adjusted to a value that can be specified by the user. The proposed noise filtering method, performed on a dense depth image or on a dense disparity image, makes it possible to improve the quality and performance of subsequent processing such as the automatic segmentation of an observed scene, i.e. the automatic decomposition of the scene into several constituent elements. The device of the invention can be part of a processing chain either as a post-processing of noisy depth images or noisy disparity images, and/or as a pre-processing for scene analysis applications using a depth image or a disparity image. Advantageously, the proposed solution is characterized by: - an adapted filtering of the 3D data, taking into account the spatial coherence of the data and the geometrical reality of the operations performed on the original signal (the light waves); - a controlled system cost, through the use of a stereoscopic sensor; - an approach requiring low computing resources and allowing real-time deployment on standard and inexpensive computing architectures. Advantageously, the filtering parameters are optimized locally, taking into account the geometrical realities of the transformations of the light signal.
[0005] Thus, the compromise of filtering capacity versus preservation of details is managed automatically, adapting to spatial localities (spatial standardization), and depending only on an intuitive parameter left to the user's choice and valid throughout the considered 3D area. Advantageously, the characteristics of the filter of the present invention depend not only on the depth but also on the distance of the objects from the optical center of the camera. More generally, the adaptations of the filter parameters are not based on empirical (in this case linear) equations but on the realities of the geometric transformations. The filter parameters also depend dynamically on a criterion of spatial coherence of the data. Advantageously, the filter is not applied directly to the data to output a filtered image; rather, the proposed method produces an image of the pixels which must be filtered, and these are then processed separately. Thus, the pixels considered valid are not modified at all. The present invention is advantageous in any real-time application analyzing all or part of a 3D scene and using as input a disparity image or a depth image. All actors involved in video surveillance, video protection or video assistance, and those whose application involves feedback on the content of a scene, will benefit from the method of the invention. To obtain the desired results, a method and a device are proposed. In particular, a method for filtering an initial 3D image comprises the steps of: - defining a local analysis area for each 3D point associated with each pixel of the initial image; - generating a spatial coherence image for all the 3D points associated with all the pixels of the initial 3D image, from a spatial coherence value measured for each 3D point on the local analysis zone; - generating a geometrical reality image for all the 3D points associated with all the pixels of the initial 3D image, from a geometrical reality value measured for each 3D point on the local analysis zone; - generating a binary image from the spatial coherence and geometrical reality images, in which each point of the binary image is classified as a scene point or a noise point according to the values of spatial coherence and geometric reality obtained for this point; and - combining the binary image with the initial 3D image to obtain a denoised image.
[0006] Advantageously, the local analysis area S(P(u, v)) consists of a 3D volume of fixed size, centered on the coordinates P(u, v) of the 3D point associated with a pixel. In one embodiment, the step of measuring a spatial coherence value Cs(u, v) for a 3D point comprises the steps of determining the set of pixels of the initial image whose associated 3D points are contained in the local analysis area for said 3D point; and defining a spatial coherence value for said 3D point as a function of the result.
[0007] In one embodiment, the step of measuring a geometrical reality value Rg(u, v) for a pixel associated with a 3D point comprises the steps of projecting the local analysis area into an empty scene; determining all the visible 3D points in the local analysis area of the empty scene; and defining a geometrical reality value for said pixel as a function of the result. In one embodiment, the step of generating a binary image comprises the steps of generating for each 3D point a filtering value from the values of spatial coherence and geometric reality; comparing the obtained filtering value with a threshold value; classifying the 3D point as a scene point or a noise point according to the result of the comparison; and generating an image of all the scene and noise points. In one embodiment, the initial image is a disparity image. In an implementation variant, the initial image is a depth image. According to the embodiments, the local analysis zone is chosen from a group comprising representations of the sphere, cube, box or cylinder type, surface representations of the 3D mesh type, volumetric representations of the voxel type, or algebraic representations. In one embodiment, the geometric reality value is pre-calculated. The invention also covers a device for filtering a noisy initial image which comprises means for carrying out the steps of the claimed method. The invention may operate in the form of a computer program product which includes code instructions for performing the claimed process steps when the program is run on a computer. DESCRIPTION OF THE FIGURES Various aspects and advantages of the invention will appear in support of the description of a preferred, but non-limiting, embodiment of the invention, with reference to the figures below: FIG. 1 illustrates the steps of the method for obtaining a denoised image according to one embodiment of the invention; FIG. 2 illustrates the steps of the method for obtaining a spatial coherence image according to an embodiment of the invention; FIG. 3 illustrates the steps of the method for obtaining a geometrical reality image according to one embodiment of the invention; FIG. 4 illustrates the steps of the method for obtaining a decision image according to an embodiment of the invention; FIG. 5 illustrates the functional blocks of the filtering device of the invention according to one embodiment; FIG. 6 illustrates a projection of six local supports in one embodiment of the invention; FIGS. 7a to 7f illustrate the images obtained at the different stages of the filtering method of FIG. 1 according to one embodiment of the invention.
[0008] DETAILED DESCRIPTION OF THE INVENTION Reference is made to FIG. 1, which generally illustrates the steps of the method (100) of the invention making it possible to obtain a denoised image. The process begins when an initial image representing a scene is to be denoised (102). The initial 3D image can be obtained according to stereoscopic vision and 3D data processing techniques, where a scene is represented by a pair of images taken from different angles. Advantageously, the method (100) can be applied to an initial image of disparity D or of depth R.
[0009] It is known that, to calculate the disparity of a point of a scene, it is necessary to have the coordinates of its two projections in the left and right images. To do this, matching algorithms are used, which aim to find, for a given point in an image, its corresponding point in the other image. Once the disparities of the points of the scene have been calculated, a corresponding cloud of points of the scene is produced. It is also known that the disparity 'd' of a point of a scene and its depth 'z' with respect to the camera are related. This link is defined by the following equation (1): z * d = B * f [Eq1] where 'B', known as the baseline, or distance between the two optical centers of the cameras, and 'f', the focal length (the same for both cameras), have constant values; a variation of disparity 'd' thus depends directly on a variation of the distance 'z' between a point and the cameras.
[0010] The coordinates (x, y, z) of a point of a scene corresponding to a pixel of coordinates (u, v) and of disparity 'd' are then calculated according to the following equations (2, 3, 4): z = B * f / d [Eq2] x = (u - u0) * z / f [Eq3] y = (v - v0) * z / f [Eq4] where (u0, v0) corresponds to the coordinates of the projection of the optical center in the image. Similarly, there is a relationship between the area of the apparent surface of an object of a scene in the image and the area of the actual surface of the apparent part of the object. A large variation in the distance of the object from the optical center of the camera involves a significant change in the area of the apparent surface of the object in the disparity images. This finding also applies to the depth images. Also, in the case of denoising using a fixed-size filter as in the prior art, for example a median filter, the change in appearance being too large, the process will perform its filtering function in a limited area of the image but will fail on the rest of the image. Therefore, advantageously, the present invention provides a new filtering method adapted to 3D data that uses optimized thresholding. The method takes into account the spatial coherence of the data and the geometrical reality of the operations carried out on the signal. To do this, two new measures are introduced: spatial coherence Cs and geometrical reality Rg. In the remainder of the description, the following notations are adopted: for the depth images, R(u, v) is a pixel of coordinates u and v in the depth image, and P(u, v) its associated 3D point of coordinates (x, y, z); for the disparity images, D(u, v) is a pixel of coordinates u and v in the disparity image, and P(u, v) is its associated 3D point of coordinates (x, y, z) calculated according to equations (2, 3, 4). Returning to FIG. 1, after receiving the initial image of disparity or depth, the method generates two new images from the initial image: a first image called the spatial coherence image (104) and a second image called the geometric reality image (106). Then, the method combines the spatial coherence and geometric reality images to generate (108) a third image, called the decision image, as will be detailed with reference to FIG. 4.
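As an aside, the back-projection of equations (2)-(4) above can be written directly. The following is a minimal sketch (the function name is hypothetical), assuming the baseline B, the focal length f and the principal point (u0, v0) are supplied by the stereo calibration:

```python
import numpy as np

def pixel_to_3d(u, v, d, B, f, u0, v0):
    # Back-project a pixel (u, v) of disparity d into a 3D point P(u, v).
    z = B * f / d            # Eq2: depth from disparity
    x = (u - u0) * z / f     # Eq3
    y = (v - v0) * z / f     # Eq4
    return np.array([x, y, z])
```

Applied to every pixel with a defined disparity, this yields the cloud of 3D points P(u, v) used in the following steps.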
[0011] In a next step, the decision image is combined with the initial image to generate (110) a denoised image of the analyzed scene. The denoised image can then be used in a scene analysis method such as image segmentation, background subtraction, automatic object recognition or multi-class detection. For example, the present invention combined with a 3D segmentation method, which decomposes a scene into separate real objects, enables localized obstacle detection.
[0012] Advantageously, the method of the invention, which generates a better-quality denoised image, makes it possible to improve the computation time of a segmentation operation, which is of the order of one hundredth (1/100) of a second. The denoised image can also advantageously be used for a simple display of the disparity or depth image, improving reading comfort and facilitating interpretation by a human user. FIGS. 7a to 7f illustrate the images obtained at the different stages of the filtering method of FIG. 1 according to one embodiment of the invention.
[0013] Figure 2 illustrates the steps of the method (104) of Figure 1 for generating a spatial coherence image in an embodiment of the invention. The initial image can be a disparity image or, in an implementation variant, a depth image.
[0014] In a first step (202), the method selects a local 3D volume support S(P(u, v)) of fixed size 's', centered at a point P(u, v). The size 's' is the volumetric accuracy or granularity desired by a user for the elements of the scene to be analyzed.
[0015] Different types of representations of the support 'S' can be adopted: - elementary representations of the sphere, cube, box or cylinder type; - surface representations of the 3D mesh type; - volumetric representations of the voxel type; or - algebraic representations such as implicit surfaces of type f(x, y, z) = 0. In the next step (204), the method determines all the points whose 3D projection is contained in the selected local support S(P(u, v)). A spatial coherence measure is calculated in the next step (206) from the number of points counted, for each pixel of coordinates (u, v), in depth or in disparity according to the implementation mode. Those skilled in the art will appreciate that the greater the number of points counted around a pixel, the greater the spatial coherence; conversely, few points around a pixel reveal a low spatial coherence, which can indicate that the pixel represents noise. Thus, the spatial coherence criterion Cs(u, v) is constructed as a function φ(E) based on the set E of pixels of the initial real image whose associated 3D points belong to the selected local support centered at P(u, v), such that Cs(u, v) = φ(E), where - E = {R(u, v) such that P(u, v) ∈ S(P(u, v))} in the case of a depth image; and - E = {D(u, v) such that P(u, v) ∈ S(P(u, v))} in the case of a disparity image.
[0016] In a preferred embodiment, the spatial coherence criterion is defined according to the following equation: Cs(u, v) = φ(E) = Card(E) [Eq5], where the function 'Card' designates the cardinal, that is, the size of E.
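As an illustration of Eq5, the spatial coherence image can be computed by counting, for each valid pixel, the 3D points falling inside its local support. This is a minimal sketch, assuming a sphere-type support of radius 's' and invalid pixels marked with NaN; the function name is hypothetical and a KD-tree is used only to make the neighborhood counts tractable:

```python
import numpy as np
from scipy.spatial import cKDTree

def spatial_coherence_image(points, s):
    # points: (H, W, 3) array of 3D coordinates P(u, v); NaN marks pixels
    # with no defined disparity or depth.
    H, W, _ = points.shape
    valid = ~np.isnan(points[..., 0])
    cloud = points[valid]                 # (N, 3) valid 3D points
    tree = cKDTree(cloud)
    # Cs(u, v) = Card(E): number of 3D points inside the sphere S(P(u, v)).
    counts = tree.query_ball_point(cloud, r=s, return_length=True)
    cs = np.zeros((H, W))
    cs[valid] = counts
    return cs
```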
[0017] Once the spatial coherence values have been calculated for all the pixels of the initial image, the method generates (208) a spatial coherence image. FIG. 3 illustrates the steps of the method (106) of FIG. 1 making it possible to generate a geometrical reality image in one embodiment of the invention, from the initial image, which may be a disparity image or, in an implementation variant, a depth image. In a first step (302), the method selects a local 3D volume support S(P(u, v)) of fixed size 's', centered at a point P(u, v). In a preferred embodiment, the support selected for processes (104) and (106) is the same. The method then projects (304), for each pixel, the local support into an empty scene. The projection step is performed for all disparity or depth values located at any pixel position (u, v) of the 2D image, within a predefined functional range and with a defined functional disparity (or respectively depth) granularity. Thus, the projections correspond to geometrical realities of the "2D to 3D" transformation. They remain valid throughout the operation of the system, as long as the optical parameters remain unchanged (internal calibration of each camera, alignment of the stereoscopic pair, height and orientation of the stereo head in its environment).
[0018] The next step (306) determines the number of points appearing in the projected support, that is to say all the points visible in the empty scene, in order to make it possible to calculate, at the next step (310), a measure of the geometrical reality Rg(u, v) for each pixel of coordinates (u, v), in depth or in disparity according to the implementation mode. Thus, the criterion of geometrical reality Rg(u, v) is constructed as a function based on the set of lit pixels, that is to say those whose disparities or projections are defined, associated with the visible points of the local support. In a preferred embodiment, the geometrical reality criterion Rg(u, v) is defined as the cardinal of this set, and corresponds to the area of the apparent surface of the local support S(P(u, v)) in the projection image of the support in the empty scene.
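For a sphere-type support, the apparent surface in the image is approximately a disk whose radius follows from the projection equations above, so Rg can be approximated in closed form. The following is a minimal sketch under that assumption (a sphere of radius 's' centered at depth z), not the general projection-and-count procedure of steps 304-306:

```python
import numpy as np

def geometric_reality_sphere(z, f, s):
    # A sphere of radius s at depth z projects to a disk of radius
    # roughly f * s / z pixels; its area in pixels approximates
    # Rg(u, v), the apparent surface of the support in an empty scene.
    r_pix = f * s / z
    return np.pi * r_pix ** 2
```

Since Rg depends only on the disparity (or depth) and not on the image content, it can be tabulated once per disparity value, which corresponds to the pre-calculation variant described below.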
[0019] By way of illustration, FIG. 6 shows, for a sphere-type support, six projections for points at different positions (u, v) and with different disparities. The example shows that the area of the apparent surface of each local support represents the geometrical reality of the corresponding point of coordinates (u, v).
[0020] Two implementations of the geometric reality criterion are possible: - either a complete pre-calculation is performed for every depth or every disparity and the result is stored; this is the preferred implementation to favor a low computation time of the processing chain, at the cost of the required memory space; - or a calculation is made at each projection; this is the preferred implementation if one wishes to favor a small memory footprint, at the expense of increased computation time.
[0021] It will be appreciated by those skilled in the art that alternative implementations are possible, such as, for example, a complete pre-calculation with compression and reduced storage. This variant requires a decompression calculation when replaying the data.
[0022] Once the geometrical reality has been calculated for all the pixels of the initial image, the method generates (312) a geometrical reality image. Figure 4 illustrates the steps of the method (108) of Figure 1 for generating a decision image in one embodiment of the invention. The process begins when the spatial coherence and geometric reality images have been generated. In a first step (402), the method defines a filtering criterion based on the two criteria of spatial coherence 'Cs' and geometric reality 'Rg'. The filtering criterion makes it possible to distinguish whether a pixel is a point of the scene or a noise point. The filtering criterion is calculated for each pixel of coordinates (u, v) of the depth (or disparity) image. The filtering criterion F(u, v) is given by a function F combining the spatial coherence Cs(u, v) and the geometrical reality Rg(u, v) of the pixel, and is denoted F(u, v) = F(Cs(u, v), Rg(u, v)). In an implementation, the function is chosen as the ratio of Cs to a power of Rg according to the following equation: F(u, v) = Cs(u, v) / (Rg(u, v))^α [Eq6] where the parameter α is used to manage the compromise between the two criteria of spatial coherence and geometrical reality. Thus, the higher α is, the more the geometrical reality is favored in the criterion. The value of α is parameterizable by the user, making it possible to adapt it to the objectives of the application.
[0023] By default, the singular case α = 1 is nevertheless intrinsically relevant, and makes it possible to interpret the filtering criterion F as a filling ratio, fixing the percentage of lit pixels in a coherent zone.
[0024] In a next step (404), the method compares the value of the filtering criterion of each point (u, v) with a threshold value. If the value of the criterion is lower than the defined threshold (no branch), the point is classified as a noise point (406). If the value of the criterion is greater than the defined threshold (yes branch), the point is classified as a point belonging to the scene (408). The next step (410) generates a decision image 'Fs' from all the points classified as 'scene' or 'noise'. The decision image is a binary image that represents a mask of the initial data (of disparity or depth), separating the set of data estimated correct, where the point is set to '1', from the set of data estimated as noise, where the point is set to '0'. When a decision image is generated, the overall method (100) makes it possible to generate a denoised image (step 110 of FIG. 1) by a combination of the original image (of disparity D(u, v) or of depth R(u, v)) and the decision image Fs. The combination of the two images depends on the application considered. In a particular implementation, the denoised image is defined according to the following equations: Df(u, v) = D(u, v) * Fs(u, v) + (1 - Fs(u, v)) * D̂(u, v) in the case of an initial disparity image; Rf(u, v) = R(u, v) * Fs(u, v) + (1 - Fs(u, v)) * R̂(u, v) in the case of an initial depth image, where D̂(u, v) and R̂(u, v) respectively denote a local estimate of the disparity (D) or depth (R) data. Thus, advantageously, the method of the invention allows the filtered image either to preserve the original value of the pixel, or to replace it with an estimate. In a particular implementation, the estimation function takes a fixed value such that D̂(u, v) = K or R̂(u, v) = K (fixed value). This implementation is useful for isolating pixels of the image (disparity or depth) by assigning them a specifically identifiable value 'K'. Such a case concerns applications where it is preferred not to consider the initially noisy pixels. In a typical implementation, K = 0 or K = 2^N - 1 for a signal resolved on N bits, so as not to interfere with the range of possible pixel values. If K = 0, the values of the output pixels are: Df(u, v) = D(u, v) * Fs(u, v) for an initial disparity image; and Rf(u, v) = R(u, v) * Fs(u, v) for an initial depth image.
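The thresholding of Eq6 and the combination above fit in a few lines. This is a minimal sketch (the function name and the choice of threshold are illustrative), assuming the Cs and Rg images have already been computed and a fixed estimate K is used for noise pixels:

```python
import numpy as np

def denoise(D, Cs, Rg, alpha=1.0, threshold=0.5, K=0):
    # Filtering criterion of Eq6: F = Cs / Rg**alpha.
    F = Cs / np.power(Rg, alpha)
    # Decision image Fs: 1 = scene point, 0 = noise point (steps 404-410).
    Fs = (F > threshold).astype(D.dtype)
    # Combination (step 110): valid pixels keep their original value,
    # noise pixels receive the identifiable fixed value K.
    return D * Fs + (1 - Fs) * K
```

With α = 1, the threshold directly fixes the filling ratio mentioned in [0023].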
[0025] In an implementation variant, the estimation function D̂(u, v) or R̂(u, v) may be a local interpolation of the data D(u, v) or R(u, v) present (not noisy) in a neighborhood of (u, v). A bilinear interpolation may be employed, or a non-linear operation of the weighted median type. This approach is relevant for obtaining a dense and "smooth" filtered image, for example for visualization or compression purposes, since atypical values such as a fixed discriminant K are not very compatible with entropy coding.
[0026] FIG. 5 schematically illustrates the functional blocks of an implementation of the device (500) of the invention for implementing the method of FIG. 1. The device comprises a block (502) making it possible to produce an initial 3D image of disparity or depth of a scene. In one implementation, the scene is viewed by a calibrated stereoscopic sensor with a controlled cost, and a disparity image (representing the 3D information) is constructed from a pair of rectified images. The block (502) is coupled to a first image generation block (504) for generating an image of the spatial coherence and to a second image generation block (506) for generating an image of the geometric reality. Blocks 504 and 506 include means for carrying out the steps described with reference to FIGS. 2 and 3. The output of blocks 504 and 506 is coupled to a third image generation block (508) to generate a filtering image. The output of block 508 is coupled to a fourth image generation block (510) to generate a decision image. Blocks 508 and 510 comprise means making it possible to implement the steps described with reference to FIG. 4.
[0027] The output of the block 510 is combined with the output of the block 502 and enters a last image generation block (512) to generate a denoised image according to the principles described with reference to step 110. Thus the device 500 allows filtering of a disparity (or depth) image to suppress natural noise such as rain, glare or dust, sensor noise, or noise related to the disparity calculations. The present invention can be combined with a 3D segmentation method of the scene. The denoised image (resulting from the device 500) is transformed into a cloud of points which are subsequently quantized in a 3D grid composed of l x h x p cells. To disconnect the obstacles from one another, which are generally connected by the ground, a filter is applied which removes the cells of the grid containing 3D points of the ground. The remaining cells are subsequently spatially segmented into connected parts using a segmentation method known from the state of the art; for example, one method is to iteratively aggregate cells by connectivity, as sketched below. The removal of the points representing the noise by application of the filter of the invention has a favorable impact on the performance of the 3D segmentation. Indeed, the interest of the filter for the segmentation is that the obstacles are often connected by noise points; in this case, it is difficult to spatially segment the different obstacles. Moreover, the interest of the quantization is that obstacles are often only partially reconstructed in the disparity image; it is therefore difficult, from the point cloud alone, to connect the different parts of the same obstacle. Finally, the advantage of removing the cells corresponding to the ground is that the obstacles are often connected by the ground; it is therefore wise to break these connections. Those skilled in the art will understand that the indicated example of a 3D obstacle detector is only one example of a scene analysis benefiting from the denoising function of the disparity image proposed by the present invention. Nevertheless, the use of the filtering proposed in the invention is not limited to the search for obstacles by segmentation; it relates to any real-time scene analysis system operating on a noisy depth image or a noisy disparity image. The present invention can be implemented from hardware and software elements. The software elements may be available as a computer program product on a computer-readable medium, which may be electronic, magnetic, optical or electromagnetic.
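The segmentation chain described above (quantization into a voxel grid, removal of ground cells, aggregation by connectivity) can be illustrated as follows. This is a minimal sketch under assumptions not stated in the text: the vertical axis is z, the ground lies near z = 0, and the cell size and ground height parameters are hypothetical:

```python
import numpy as np
from scipy import ndimage

def segment_obstacles(points, cell=0.1, ground_z=0.05):
    # points: (N, 3) denoised point cloud; drop the ground points first.
    pts = points[points[:, 2] > ground_z]
    # Quantize into a 3D grid of l x h x p cells.
    idx = np.floor(pts / cell).astype(int)
    idx -= idx.min(axis=0)
    grid = np.zeros(idx.max(axis=0) + 1, dtype=bool)
    grid[tuple(idx.T)] = True
    # Aggregate occupied cells by connectivity: one label per obstacle.
    labels, n_obstacles = ndimage.label(grid)
    return labels, n_obstacles
```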
Claims:
Claims (11)
[0001]
CLAIMS 1. A method for filtering an initial 3D image, comprising the steps of: - defining a local analysis area for each 3D point associated with each pixel of the initial image; - generating a spatial coherence image for all the 3D points associated with all the pixels of the initial 3D image, from a spatial coherence value measured for each 3D point on the local analysis zone; - generating a geometrical reality image for all the 3D points associated with all the pixels of the initial 3D image, from a geometrical reality value measured for each 3D point on the local analysis zone; - generating a binary image from the spatial coherence and geometrical reality images, in which each point of the binary image is classified as a scene point or a noise point according to the values of spatial coherence and geometric reality obtained for this point; and - combining the binary image with the initial 3D image to obtain a denoised image.
[0002]
2. The method according to claim 1, wherein the step of defining a local analysis area S(P(u, v)) consists in defining a 3D volume of fixed size, centered on the coordinates P(u, v) of a 3D point associated with a pixel.
[0003]
3. The method of claim 1 or 2, wherein the step of measuring a spatial coherence value Cs(u, v) for a 3D point comprises the steps of: - determining all the pixels of the initial image whose associated 3D points are contained in the local analysis area for said 3D point; and - defining a spatial coherence value for said 3D point as a function of the result.
[0004]
4. The method according to any one of claims 1 to 3, wherein the step of measuring a geometrical reality value Rg(u, v) for a pixel associated with a 3D point comprises the steps of: - projecting the local analysis area into an empty scene; - determining all the visible 3D points in the local analysis zone of the empty scene; and - defining a geometrical reality value for said pixel as a function of the result.
[0005]
5. The method according to claim 1, wherein the step of generating a binary image comprises the steps of: - generating for each 3D point a filtering value based on the values of spatial coherence and geometrical reality; - comparing the obtained filtering value with a threshold value; - classifying the 3D point as a scene point or a noise point according to the result of the comparison; and - generating an image of all the scene and noise points.
[0006]
6. The method of any one of claims 1 to 5, wherein the initial image is a disparity image.
[0007]
7. The method of any one of claims 1 to 5, wherein the initial image is a depth image.
[0008]
8. The method according to any one of claims 1 to 7, wherein the local analysis area is selected from a group comprising representations of the sphere, cube, box or cylinder type, surface representations of the 3D mesh type, volumetric representations of the voxel type, or algebraic representations.
[0009]
9. The method according to any one of claims 1 to 8, wherein the geometrical reality value is pre-calculated.
[0010]
10. A device for filtering an initial image, the device comprising means for implementing the steps of the method according to any one of claims 1 to 9.
[0011]
11. A computer program product, said computer program comprising code instructions for performing the steps of the method according to any one of claims 1 to 9, when said program is run on a computer.
Similar technologies:
Publication number | Publication date | Patent title
EP3221841B1|2018-08-29|Method and device for the real-time adaptive filtering of noisy depth or disparity images
EP2304686B1|2018-04-18|Method and device for filling in the zones of occultation of a map of depth or of disparities estimated on the basis of at least two images
Artusi et al.2011|A survey of specularity removal methods
KR20100085675A|2010-07-29|Method of filtering depth noise using depth information and apparatus for enabling the method
FR3011368A1|2015-04-03|METHOD AND DEVICE FOR REINFORCING THE SHAPE OF THE EDGES FOR VISUAL IMPROVEMENT OF THE RENDER BASED ON DEPTH IMAGES OF A THREE-DIMENSIONAL VIDEO STREAM
EP3114831B1|2021-06-09|Optimised video denoising for heterogeneous multisensor system
EP0751482B1|2003-05-02|Method and apparatus for temporal filtering of noise in an image sequence
EP2909671B1|2016-12-07|Method for designing a single-path imager able to estimate the depth of field
EP0961227A1|1999-12-01|Method of detecting the relative depth between two objects in a scene from a pair of images taken at different views
EP0410826B1|1994-05-04|Iterative motion estimation process, between a reference image and a current image, and device for canying out the process
EP2943935B1|2017-03-08|Estimation of the movement of an image
WO2020157733A1|2020-08-06|Dynamic three-dimensional imaging method
EP3384462B1|2020-05-06|Method for characterising a scene by calculating the 3d orientation
Tsiotsios et al.2016|Effective backscatter approximation for photometry in murky water
EP1095358B1|2003-10-01|Method for modelling three-dimensional objects or scenes
CA3105372A1|2020-01-02|Processing of impulse noise in a video sequence
FR3073311A1|2019-05-10|METHOD FOR ESTIMATING THE INSTALLATION OF A CAMERA IN THE REFERENTIAL OF A THREE-DIMENSIONAL SCENE, DEVICE, INCREASED REALITY SYSTEM, AND COMPUTER PROGRAM
FR3097974A1|2021-01-01|PASSIVE TELEMETRY METHOD AND DEVICE BY IMAGE PROCESSING AND USE OF THREE-DIMENSIONAL MODELS
FR3066633A1|2018-11-23|METHOD FOR DEFLOWING AN IMAGE
WO2020136180A1|2020-07-02|Method for segmenting an image
Baeza2016|Mathematical Methods in Image Processing and Computer Vision
FR3092423A1|2020-08-07|IMAGE SAILLANCE MAPPING DURING ARTIFICIAL INTELLIGENCE IMAGE CLASSIFICATION
EP3072110A1|2016-09-28|Method for estimating the movement of an object
FR3075438A1|2019-06-21|HYBRID SYNTHESIS METHOD OF IMAGES
FR3009471A1|2015-02-06|METHOD FOR PROCESSING AN IMAGE SEQUENCE, CORRESPONDING COMPUTER PROGRAM AND PROCESSING DEVICE
Patent family:
Publication number | Publication date
US20170337665A1|2017-11-23|
CN107004256B|2020-10-27|
JP2017535884A|2017-11-30|
US10395343B2|2019-08-27|
WO2016079179A1|2016-05-26|
EP3221841B1|2018-08-29|
EP3221841A1|2017-09-27|
FR3028988B1|2018-01-19|
CN107004256A|2017-08-01|
JP6646667B2|2020-02-14|
Cited references:
Publication number | Filing date | Publication date | Applicant | Patent title
US20120263353A1|2009-12-25|2012-10-18|Honda Motor Co., Ltd.|Image processing apparatus, image processing method, computer program, and movable body|
DE10317367B4|2003-04-15|2007-01-11|Siemens Ag|Method of performing digital subtraction angiography using native volume data sets|
US8139142B2|2006-06-01|2012-03-20|Microsoft Corporation|Video manipulation of red, green, blue, distance data including segmentation, up-sampling, and background substitution techniques|
JP5041458B2|2006-02-09|2012-10-03|本田技研工業株式会社|Device for detecting three-dimensional objects|
EP2584494A3|2006-08-03|2015-02-11|Alterface S.A.|Method and devicefor identifying and extractingimages of multiple users, and for recognizing user gestures|
WO2008041167A2|2006-10-02|2008-04-10|Koninklijke Philips Electronics N.V.|Method and filter for recovery of disparities in a video stream|
KR101526866B1|2009-01-21|2015-06-10|삼성전자주식회사|Method of filtering depth noise using depth information and apparatus for enabling the method|
CN101640809B|2009-08-17|2010-11-03|浙江大学|Depth extraction method of merging motion information and geometric information|
US10095953B2|2009-11-11|2018-10-09|Disney Enterprises, Inc.|Depth modification for display applications|
US9858475B2|2010-05-14|2018-01-02|Intuitive Surgical Operations, Inc.|Method and system of hand segmentation and overlay using depth data|
JP2012120647A|2010-12-07|2012-06-28|Alpha Co|Posture detection system|
US8982117B2|2011-06-22|2015-03-17|Samsung Display Co., Ltd.|Display apparatus and method of displaying three-dimensional image using same|
EP2786580B1|2011-11-30|2015-12-16|Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.|Spatio-temporal disparity-map smoothing by joint multilateral filtering|
US9779546B2|2012-05-04|2017-10-03|Intermec Ip Corp.|Volume dimensioning systems and methods|
CA2873218A1|2012-05-10|2013-11-14|President And Fellows Of Harvard College|Automated system and method for collecting data and classifying animal behavior|
US9811880B2|2012-11-09|2017-11-07|The Boeing Company|Backfilling points in a point cloud|
US9292927B2|2012-12-27|2016-03-22|Intel Corporation|Adaptive support windows for stereoscopic image correlation|
CA2820305A1|2013-07-04|2015-01-04|University Of New Brunswick|Systems and methods for generating and displaying stereoscopic image pairs of geographical areas|
US9530218B2|2014-04-04|2016-12-27|Hrl Laboratories, Llc|Method for classification and segmentation and forming 3D models from images|
US9383548B2|2014-06-11|2016-07-05|Olympus Corporation|Image sensor for depth estimation|
US10154241B2|2014-09-05|2018-12-11|Polight As|Depth map based perspective correction in digital photos|
JP2016091457A|2014-11-10|2016-05-23|富士通株式会社|Input device, fingertip-position detection method, and computer program for fingertip-position detection|
CN105812649B|2014-12-31|2019-03-29|联想有限公司|A kind of image capture method and device|
WO2016130116A1|2015-02-11|2016-08-18|Analogic Corporation|Three-dimensional object image generation|
CN105184780B|2015-08-26|2018-06-05|京东方科技集团股份有限公司|A kind of Forecasting Methodology and system of stereoscopic vision depth|US10304256B2|2016-12-13|2019-05-28|Indoor Reality Inc.|Point cloud cleaning method|
CN110400272B|2019-07-11|2021-06-18|Oppo广东移动通信有限公司|Depth data filtering method and device, electronic equipment and readable storage medium|
CN110378946B|2019-07-11|2021-10-01|Oppo广东移动通信有限公司|Depth map processing method and device and electronic equipment|
CN110415287B|2019-07-11|2021-08-13|Oppo广东移动通信有限公司|Depth map filtering method and device, electronic equipment and readable storage medium|
CN110782416A|2019-11-05|2020-02-11|北京深测科技有限公司|Drying method for three-dimensional point cloud data|
CN112116623B|2020-09-21|2021-04-23|推想医疗科技股份有限公司|Image segmentation method and device|
Legal status:
2015-11-30| PLFP| Fee payment|Year of fee payment: 2 |
2016-05-27| PLSC| Publication of the preliminary search report|Effective date: 20160527 |
2016-11-30| PLFP| Fee payment|Year of fee payment: 3 |
2017-11-30| PLFP| Fee payment|Year of fee payment: 4 |
2018-11-29| PLFP| Fee payment|Year of fee payment: 5 |
2019-11-29| CL| Concession to grant licences|Name of requester: ECO-COMPTEUR, FR Effective date: 20191025 |
2020-10-16| ST| Notification of lapse|Effective date: 20200906 |
Priority:
Application number | Filing date | Patent title
FR1461260|2014-11-20|
FR1461260A|2014-11-20|METHOD AND APPARATUS FOR REAL-TIME ADAPTIVE FILTERING OF NOISY DISPARITY OR DEPTH IMAGES
FR1461260A|FR3028988B1|2014-11-20|METHOD AND APPARATUS FOR REAL-TIME ADAPTIVE FILTERING OF NOISY DISPARITY OR DEPTH IMAGES|
US15/524,217| US10395343B2|2014-11-20|2015-11-18|Method and device for the real-time adaptive filtering of noisy depth or disparity images|
CN201580063436.8A| CN107004256B|2014-11-20|2015-11-18|Method and apparatus for real-time adaptive filtering of noisy depth or parallax images|
EP15798392.5A| EP3221841B1|2014-11-20|2015-11-18|Method and device for the real-time adaptive filtering of noisy depth or disparity images|
JP2017527249A| JP6646667B2|2014-11-20|2015-11-18|Method and apparatus for real-time adaptive filtering of noisy depth or parallax images|
PCT/EP2015/076964| WO2016079179A1|2014-11-20|2015-11-18|Method and device for the real-time adaptive filtering of noisy depth or disparity images|